AITopics | visual object detection

FreeAnchor: Learning to Match Anchors for Visual Object Detection

Neural Information Processing Systems | Dec-25-2025, 07:43:22 GMT

Modern CNN-based object detectors assign anchors to ground-truth objects under the restriction of object-anchor Intersection-over-Union (IoU). In this study, we propose a learning-to-match approach to break the IoU restriction, allowing objects to match anchors in a flexible manner. Our approach, referred to as FreeAnchor, updates hand-crafted anchor assignment to "free" anchor matching by formulating detector training as a maximum likelihood estimation (MLE) procedure. FreeAnchor aims to learn features which best explain a class of objects in terms of both classification and localization. FreeAnchor is implemented by optimizing a detection-customized likelihood and can be fused with CNN-based detectors in a plug-and-play manner. Experiments on MS-COCO demonstrate that FreeAnchor consistently outperforms its counterparts by significant margins.
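To make the "IoU restriction" concrete, here is a minimal sketch of the hand-crafted assignment that the abstract contrasts FreeAnchor with. It is not the authors' code: the function names, the `(x1, y1, x2, y2)` box layout, and the `pos_thresh=0.5` cutoff are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_by_iou(anchors, gt_boxes, pos_thresh=0.5):
    """Match each anchor to the ground-truth box it overlaps most.

    Returns one ground-truth index per anchor, or -1 (background) when the
    best IoU is below pos_thresh. The threshold value is an assumption.
    """
    assignment = []
    for anchor in anchors:
        best_gt, best_iou = -1, 0.0
        for j, gt in enumerate(gt_boxes):
            overlap = iou(anchor, gt)
            if overlap > best_iou:
                best_gt, best_iou = j, overlap
        assignment.append(best_gt if best_iou >= pos_thresh else -1)
    return assignment


anchors = [(0, 0, 2, 3), (10, 10, 12, 12)]
gt_boxes = [(0, 0, 2, 2)]
print(assign_by_iou(anchors, gt_boxes))
```

Under a fixed rule like this, an anchor is a positive sample only if its overlap clears the threshold, regardless of whether its features describe the object well. That is the restriction the abstract says the learning-to-match approach relaxes.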

freeanchor, match anchor, name change, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.62)

Lecture Video Visual Objects (LVVO) Dataset: A Benchmark for Visual Object Detection in Educational Videos

Biswas, Dipayan, Shah, Shishir, Subhlok, Jaspal

arXiv.org Artificial Intelligence | Jun-18-2025

We introduce the Lecture Video Visual Objects (LVVO) dataset, a new benchmark for visual object detection in educational video content. The dataset consists of 4,000 frames extracted from 245 lecture videos spanning biology, computer science, and geosciences. A subset of 1,000 frames, referred to as LVVO_1k, has been manually annotated with bounding boxes for four visual categories: Table, Chart-Graph, Photographic-image, and Visual-illustration. Each frame was labeled independently by two annotators, resulting in an inter-annotator F1 score of 83.41%, indicating strong agreement. To ensure high-quality consensus annotations, a third expert reviewed and resolved all cases of disagreement through a conflict resolution process. To expand the dataset, a semi-supervised approach was employed to automatically annotate the remaining 3,000 frames, forming LVVO_3k. The complete dataset offers a valuable resource for developing and evaluating both supervised and semi-supervised methods for visual content detection in educational videos. The LVVO dataset is publicly available to support further research in this domain.
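The abstract reports an inter-annotator F1 score but does not describe how boxes from the two annotators were matched. One plausible way to compute such a score is sketched below, under stated assumptions: boxes match when they share a label and their IoU is at least `iou_thresh` (0.5 here, an assumed value), each box is matched at most once, and the function name `annotator_f1` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def annotator_f1(boxes_a, boxes_b, iou_thresh=0.5):
    """F1 agreement between two annotators on one frame.

    Each argument is a list of (label, (x1, y1, x2, y2)) pairs. Matching is
    greedy and one-to-one; this protocol is an assumption, not the paper's.
    """
    unmatched_b = list(range(len(boxes_b)))
    matches = 0
    for label_a, box_a in boxes_a:
        best_k, best_iou = None, iou_thresh
        for k in unmatched_b:
            label_b, box_b = boxes_b[k]
            if label_b != label_a:
                continue
            overlap = iou(box_a, box_b)
            if overlap >= best_iou:
                best_k, best_iou = k, overlap
        if best_k is not None:
            unmatched_b.remove(best_k)
            matches += 1
    total = len(boxes_a) + len(boxes_b)
    # F1 = 2 * matches / (boxes from A + boxes from B); empty frames agree.
    return 2 * matches / total if total else 1.0


annotator_1 = [("Table", (0, 0, 10, 10)), ("Chart-Graph", (20, 20, 30, 30))]
annotator_2 = [("Table", (1, 1, 10, 10)), ("Photographic-image", (20, 20, 30, 30))]
print(annotator_f1(annotator_1, annotator_2))
```

In this toy frame the two annotators agree on the table but label the second region differently, so only one of the four boxes pairs up on each side.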

annotation, artificial intelligence, dataset, (13 more...)

2506.13657

Genre:

Instructional Material (0.47)
Research Report (0.40)

Industry: Education > Educational Technology > Audio & Video (0.82)

Technology: Information Technology > Artificial Intelligence > Vision (0.62)

Reviews: FreeAnchor: Learning to Match Anchors for Visual Object Detection

Neural Information Processing Systems | Jan-23-2025, 07:24:28 GMT

I am raising my score to seven. The authors begin by noting that many existing object detection pipelines include an 'anchor assignment' step: from a large set of candidate bounding boxes (or "anchors") in a generic image frame, the one that best matches the ground-truth bounding box, as measured by IoU, is chosen for training, i.e. the object detection and bounding box regression outputs for that anchor are pushed towards the ground truth. The authors note that for objects which don't fill the anchor well (slim objects oriented diagonally, objects with holes, or occluded objects), the best anchor according to this IoU comparison may be actively bad for training as a whole. The authors propose "learning to match", i.e. producing a custom likelihood which promotes both precision and recall of the final result (making reference to terms from the traditional loss function). For each ground-truth bounding box, a 'bag of anchors' is selected by ranking IoU and picking the best n. During training, a different bounding box is selected from this bag for each object, for each backward pass.
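The bag construction the reviewer describes (rank anchors by IoU, keep the best n) can be sketched as follows. This is an illustration, not the paper's implementation: `build_anchor_bags` and `bag_size` are hypothetical names, and the step that picks one anchor from each bag during training is left out because the review does not specify it.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_anchor_bags(anchors, gt_boxes, bag_size=3):
    """For each ground-truth box, return indices of its top-n anchors by IoU."""
    bags = []
    for gt in gt_boxes:
        # Rank all anchor indices from highest to lowest overlap with this box.
        ranked = sorted(
            range(len(anchors)),
            key=lambda i: iou(anchors[i], gt),
            reverse=True,
        )
        bags.append(ranked[:bag_size])
    return bags


anchors = [(0, 0, 4, 4), (1, 1, 5, 5), (2, 2, 6, 6), (20, 20, 24, 24)]
gt_boxes = [(0, 0, 4, 4)]
print(build_anchor_bags(anchors, gt_boxes, bag_size=2))
```

Keeping several candidates per object, rather than the single highest-IoU anchor, is what leaves room for training to prefer an anchor other than the top-ranked one.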

anchor, ground truth, visual object detection, (10 more...)

Genre: Summary/Review (0.36)

Technology: Information Technology > Artificial Intelligence > Vision (0.82)

Reviews: FreeAnchor: Learning to Match Anchors for Visual Object Detection

Neural Information Processing Systems | Jan-23-2025, 07:24:18 GMT

The paper presents a better loss function for anchor-based detection methods by matching anchors to GT boxes in a differentiable manner. Three reviewers recommend acceptance after a convincing rebuttal. The final decision is to accept.

freeanchor, match anchor, visual object detection, (1 more...)

Technology: Information Technology > Artificial Intelligence > Vision (0.40)

FreeAnchor: Learning to Match Anchors for Visual Object Detection

Zhang, Xiaosong, Wan, Fang, Liu, Chang, Ji, Rongrong, Ye, Qixiang

Neural Information Processing Systems | Mar-18-2020, 20:18:51 GMT

Modern CNN-based object detectors assign anchors to ground-truth objects under the restriction of object-anchor Intersection-over-Union (IoU). In this study, we propose a learning-to-match approach to break the IoU restriction, allowing objects to match anchors in a flexible manner. Our approach, referred to as FreeAnchor, updates hand-crafted anchor assignment to "free" anchor matching by formulating detector training as a maximum likelihood estimation (MLE) procedure. FreeAnchor aims to learn features which best explain a class of objects in terms of both classification and localization. FreeAnchor is implemented by optimizing a detection-customized likelihood and can be fused with CNN-based detectors in a plug-and-play manner.

freeanchor, match anchor, visual object detection, (2 more...)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Implementing RoI Pooling in TensorFlow Keras

#artificialintelligence | May-5-2019, 12:43:52 GMT

In this post we explain the basic concept and general usage of RoI (Region of Interest) pooling and provide an implementation using Keras layers and the TensorFlow backend. The intended audience for this post is readers who are familiar with the basic theory of (Convolutional) Neural Networks and can build and run simple models using Keras. If you are here just for the code, help yourself to this gist, and do not forget to like and share the article! RoI Pooling was proposed by Ross Girshick in the Fast R-CNN paper as part of his object recognition pipeline. In the general use case for RoI Pooling we have an image-like object and multiple regions of interest specified via bounding boxes.
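The core operation can be illustrated without any framework. The sketch below is not the post's Keras layer or its gist; it is a pure-Python stand-in that max-pools one region of a single-channel feature map into a fixed-size grid. The function name `roi_max_pool`, the end-exclusive `(x1, y1, x2, y2)` integer coordinates, and the integer bin boundaries are all assumptions made for this example.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one region of a 2D feature map into an out_h x out_w grid.

    feature_map: list of rows (single channel).
    roi: (x1, y1, x2, y2) in feature-map cells, end-exclusive.
    """
    x1, y1, x2, y2 = roi
    region_h, region_w = y2 - y1, x2 - x1
    if region_h < out_h or region_w < out_w:
        raise ValueError("region must be at least as large as the output grid")

    pooled = []
    for i in range(out_h):
        # Integer bin edges; bins may differ in size by one cell.
        r0 = y1 + (i * region_h) // out_h
        r1 = y1 + ((i + 1) * region_h) // out_h
        row = []
        for j in range(out_w):
            c0 = x1 + (j * region_w) // out_w
            c1 = x1 + ((j + 1) * region_w) // out_w
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2))  # [[6, 8], [14, 16]]
```

Whatever the size of the region, the output has the same fixed shape, which is what lets regions of different sizes feed the same downstream layers.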

artificial intelligence, machine learning, tensor, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Crowdsourcing Annotations for Visual Object Detection

Su, Hao (Stanford University) | Deng, Jia (Stanford University) | Fei-Fei, Li (Stanford University)

AAAI Conferences | Jul-21-2012

A large number of images with ground-truth object bounding boxes is critical for learning object detectors, a fundamental task in computer vision. In this paper, we study strategies to crowdsource bounding box annotations. The core challenge of building such a system is to effectively control the data quality with minimal cost. Our key observation is that drawing a bounding box is significantly more difficult and time-consuming than giving answers to multiple-choice questions. Thus quality control through additional verification tasks is more cost-effective than consensus-based algorithms. In particular, we present a system that consists of three simple sub-tasks: a drawing task, a quality verification task, and a coverage verification task. Experimental results demonstrate that our system is scalable, accurate, and cost-effective.

artificial intelligence, machine learning, social media, (17 more...)

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.34)

Industry: Education (0.89)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
